Do Individualized Patient-Specific Situations Predict the Progression Rate and Fate of Knee Osteoarthritis? Prediction of Knee Osteoarthritis

Factors affecting the progression rate and fate of osteoarthritis need to be analyzed in light of patient-specific situations. This study aimed to identify the rate of remarkable progression and fate of primary knee osteoarthritis based on patient-specific situations. Between May 2003 and May 2019, 83,280 patients with knee pain were recruited for this study from the clinical data warehouse. Finally, 2492 knees with pain that were followed up for more than one year were analyzed. To analyze the affecting factors, patient-specific information was classified as demographic, radiologic, social, comorbidity, and surgical intervention data. The degree of contribution of factors to the progression rate and the fate of osteoarthritis was analyzed. Bone mineral density (BMD), Kellgren–Lawrence (K–L) grade, and physical occupational demands were major contributors to the progression rate of osteoarthritis. Hypertension, initial K–L grade, and physical occupational demands were major contributors to the outcome of osteoarthritis. The progression rate and fate of osteoarthritis were mostly affected by the initial K–L grade and physical occupational demands. Patients who underwent surgical intervention within five years had the highest proportion of initial K–L grade 2 (49.0%) and occupations with high physical demand (41.3%). Among the several contributing factors identified, the initial K–L grade and physical occupational demands were the most important. BMD and hypertension were also major contributors to the progression and fate of osteoarthritis, but their degree of contribution was lower than that of the two major factors.


Introduction
Osteoarthritis (OA) represents an important healthcare problem in an aging society because its incidence increases with aging. OA involves inflammation and major structural changes in the joints [1]. This results in irreversible damage to the joint cartilage and bony structures [2]. The knee joint is the most common site of OA, and it causes physical disability and chronic pain [3,4]. Globally, over 250 million people have symptomatic knee OA. Furthermore, the prevalence of symptomatic knee OA is particularly higher among Asians [5]. Despite the high incidence of knee OA, few drugs have been proven to effectively modify disease progression, and only a few treatment options have been proven to relieve symptoms. Considering the failure to find an effective disease-modifying OA drug (DMOAD) or nonsurgical treatment, it is necessary to prevent or delay the degenerative process in the absence of radiologic or mild OA with knee pain [6,7].
OA has a multifactorial etiology, which occurs because of the interaction between systemic and local factors. Some meta-analyses have shown that hypertension and diabetes

Materials and Methods
Institutional Review Board approval was obtained before performing the study. Between May 2003 and May 2019, 83,280 patients with knee pain were recruited for this study. Consistent data obtained from the clinical data warehouse (CDW) of a single institution were utilized. For patient-specific situations, data were categorized as follows: (1) demographic data, including age, sex, body mass index (BMI), and BMD; (2) radiological data, including official readings of initial and final radiographs of the knee joint (the final radiograph was defined as the one taken at the time of the last K-L grade or just before surgery); (3) social data, including occupation; (4) comorbidity data, including hypertension, diabetes mellitus, and others (tuberculosis, liver disease, cardiovascular disease, cancer, epilepsy, and kidney disease); and (5) surgical data, including information about surgical interventions. The exclusion criteria were as follows: (1) follow-up that started less than 10 years before the time of data acquisition (accordingly, patients who had first visited no later than May 2010 were enrolled); (2) a conservative treatment period of less than one year; (3) initial K-L grade 3 or 4; and (4) a history of major trauma to the lower extremity, including fracture or ligament, cartilage, or meniscal injury caused by the event. Secondary OA was excluded so that the prediction addressed the progression of primary OA. Finally, a total of 2492 knees (1678 patients) were enrolled in this study.

Classification
Age was divided into three categories for even distribution (age < 55 years, 55 ≤ age < 65 years, and age ≥ 65 years). These divided categories of age were used for comparing according to K-L grade, the progression rate, and the fate of OA. The earliest BMD value was used for analysis, and when not performed, patients were considered to have missing values. BMD was divided into two categories: normal (T-score ≥ −1.0) and osteopenia or osteoporosis (T-score < −1.0). To check the remarkable change, the progression rate of OA was defined as the time elapsed for two or more K-L grade changes from no radiologic or mild OA.
The progression rate of OA was divided into three groups for even distribution (fast: time < 5 years; usual: 5 years ≤ time < 10 years; and slow: time ≥ 10 years). In terms of physical demand, occupation was classified based on the degree of kneeling, squatting, lifting, walking, climbing, and standing while referencing the relative risk of occupation to OA, as described by Palmer et al. [14,16]. We divided the physical demands of occupations into three categories: (1) low demand: white-collar workers (office workers), students, and jobless individuals; (2) mild demand: workers in the manufacturing industry, kitchen and serving staff, soldiers, and light manual workers; and (3) high demand: construction workers, factory workers, farmers, professional sports players, and hard laborers. Three categories of comorbidity factors were selected for analysis considering incidence: (1) hypertension, (2) diabetes mellitus, and (3) other disorders (one or more other comorbidities). The outcome of OA was defined as surgical or nonsurgical treatment. Osteotomy, unicompartmental knee arthroplasty (UKA), and total knee arthroplasty (TKA) were included as surgical interventions for knee OA. The fate of OA was classified as an elapsed time to surgical intervention. The elapsed time to surgical intervention was divided into two groups based on the balance in the data (five-year basis). If there was no surgical intervention, it was included in the conservative treatment group. Descriptions were summarized in Table 1.
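The cut-offs described above can be written down as a small set of grouping functions. The following is a minimal sketch, not the authors' code: the function names and label strings are hypothetical, and the handling of an interval of exactly five years to surgery follows the "≤ 5 years" convention used for the Figure 2 labels.

```python
from typing import Optional


def age_group(age: float) -> str:
    """Three age categories: < 55, 55 to < 65, and >= 65 years."""
    if age < 55:
        return "<55"
    if age < 65:
        return "55-64"
    return ">=65"


def bmd_category(t_score: Optional[float]) -> Optional[str]:
    """Normal (T-score >= -1.0) versus osteopenia or osteoporosis (T-score < -1.0)."""
    if t_score is None:
        return None  # BMD not measured: treated as a missing value
    return "normal" if t_score >= -1.0 else "osteopenia_or_osteoporosis"


def progression_group(years_to_two_grade_change: float) -> str:
    """Fast (< 5 years), usual (5 to < 10 years), slow (>= 10 years)."""
    if years_to_two_grade_change < 5:
        return "fast"
    if years_to_two_grade_change < 10:
        return "usual"
    return "slow"


def fate_group(years_to_surgery: Optional[float]) -> str:
    """Conservative (no surgery) or surgery within / after five years."""
    if years_to_surgery is None:
        return "conservative"
    if years_to_surgery <= 5:
        return "surgery_within_5_years"
    return "surgery_after_5_years"


# Hypothetical example: a 58-year-old with T-score -1.4 whose knee changed by
# two K-L grades in 6.5 years and who was never operated on.
example = (age_group(58), bmd_category(-1.4), progression_group(6.5), fate_group(None))
```

Keeping the thresholds in one place makes it easy to check that every knee falls into exactly one group per variable.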


Affecting Factor Analysis
A model was built to predict the progression rate and the fate of OA, and the degree of contribution of each factor to the progression rate and the fate of OA was analyzed. The Scikit-learn library 0.23.2 in Python (version 3.6; Python Software Foundation, Wilmington, DE, USA) was used for our prediction model (Figure 1). We used multinomial logistic regression to predict the rate of progression and fate of OA. SoftMax was used as the classifier of the multinomial logistic regression, and cross-entropy loss was used to calculate the error during training (the difference between the SoftMax output and the true label; Appendix A). For validation of the model, data were randomly partitioned into 70%:30%-sized groups during each iteration to ensure independence between the training data and test data. Furthermore, we iterated 100 times to analyze the validity of the models by calculating the 95% confidence intervals. Age and BMI were used as continuous variables of the prediction model. The categorical variables of the prediction model were classified as follows: (1) sex: male = 0, female = 1; (2) BMD: normal = 0, osteopenia or osteoporosis = 1; (3) physical occupational demands: low = 0, mild = 1, high = 2; and (4) comorbidities: absent = 0, present = 1. Confusion matrices were used to calculate the performance of the model (Figure 2). The class labels in Figure 2 were as follows: (a) time to a K-L grade change of two or more, 0: time ≥ 10 years, 1: 5 years ≤ time < 10 years, 2: time < 5 years; (b) time to surgical intervention, 0: conservative, 1: surgical intervention ≤ 5 years, 2: surgical intervention > 5 years (maximum: 15 years).
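The validation scheme (100 random 70%:30% partitions followed by a 95% confidence interval) can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: it uses only the Python standard library, a majority-class predictor stands in for the multinomial logistic regression, the data are randomly generated, and the interval is a normal approximation around the mean accuracy, because the text does not state how the interval was computed.

```python
import random
import statistics


def holdout_split(n_samples, test_fraction, rng):
    """Return (train_indices, test_indices) for one random partition."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def fit_majority_class(train_features, train_labels):
    """Hypothetical stand-in model: always predicts the most frequent training label."""
    most_common = max(sorted(set(train_labels)), key=train_labels.count)
    return lambda features: most_common


def repeated_holdout(features, labels, fit, n_iterations=100, test_fraction=0.3, seed=0):
    """Mean test accuracy and a 95% interval over repeated random partitions."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_iterations):
        train_idx, test_idx = holdout_split(len(labels), test_fraction, rng)
        predict = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(features[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    mean = statistics.mean(accuracies)
    # Assumption: normal-approximation interval for the mean accuracy.
    half_width = 1.96 * statistics.stdev(accuracies) / len(accuracies) ** 0.5
    return mean, (mean - half_width, mean + half_width)


# Randomly generated rows using the encoding above:
# [sex, BMD, physical occupational demand, hypertension]; labels are classes 0, 1, 2.
data_rng = random.Random(42)
features = [[data_rng.randint(0, 1), data_rng.randint(0, 1),
             data_rng.randint(0, 2), data_rng.randint(0, 1)] for _ in range(200)]
labels = [data_rng.randint(0, 2) for _ in range(200)]

mean_accuracy, confidence_interval = repeated_holdout(features, labels, fit_majority_class)
```

Because each iteration draws a fresh partition, no knee appears in both the training and the test set of the same iteration, which is the independence property the text refers to.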
The performance of the model was assessed by accuracy, precision, recall, F1-score, specificity, and error rate. The prediction model was built by using factors in Tables 2 and 3 and used to obtain the coefficient for estimating the effects of each factor. Coefficients were classified into two ranks by calculating the average of all coefficients of each category to determine the relative effect (average coefficient as a major contributor ≥ |0.211| and average coefficient as a minor contributor < |0.211|).
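For a three-class model, each of these measures can be derived per class from the confusion matrix by treating that class as positive and the other two as negative. The sketch below shows one common way to do this; the matrix values are invented for illustration, and the convention that rows are true classes and columns are predicted classes is an assumption.

```python
def one_vs_rest_metrics(confusion, k):
    """Per-class metrics for class k; rows = true class, columns = predicted class."""
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                 # class k predicted as another class
    fp = sum(row[k] for row in confusion) - tp  # another class predicted as class k
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
        "error_rate": 1.0 - accuracy,
    }


# Hypothetical 3x3 confusion matrix (not taken from Figure 2).
confusion_matrix = [
    [50, 10, 5],
    [8, 40, 12],
    [4, 6, 65],
]
metrics_class_0 = one_vs_rest_metrics(confusion_matrix, 0)
```

Under this one-vs-rest reading, accuracy is reported separately for each class, which matches the way the results list one accuracy value per group.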


Statistical Analysis
Conventional statistical analyses were performed by using SPSS version 18.0. (IBM Corp., Armonk, NY, USA). Data descriptions were based on the means and standard deviations for continuous variables. The chi-square test was used to compare categorical variables (sex, BMD, occupation, and comorbidities). Analysis of variance (ANOVA) was performed to compare age, initial and final K-L grade, and BMI between the study groups. Tukey's honestly significant difference (Tukey HSD) was used for post-hoc analysis. The results were considered statistically significant when the p value was <0.05. The conventional statistical analysis was used to compare each factor according to K-L grade (Table 1), the progression rate (Table 2), and the fate of OA (Table 3).
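As an illustration of the categorical comparison, the Pearson chi-square statistic for a contingency table can be computed by hand as below. This is a sketch with invented counts, not output from SPSS: it returns only the statistic and the degrees of freedom, and significance is judged against the tabulated critical value (5.991 for 2 degrees of freedom at the 0.05 level) instead of an exact p value.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = hypertension present / absent,
# columns = fast / usual / slow progression.
counts = [
    [30, 20, 10],
    [10, 20, 30],
]
statistic, dof = chi_square_statistic(counts)

CRITICAL_VALUE_DF2_ALPHA_05 = 5.991  # tabulated chi-square critical value
is_significant = statistic > CRITICAL_VALUE_DF2_ALPHA_05
```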

Results
Patient demographics were arranged according to the initial K-L grade of no radiologic or mild OA ( Table 1). The mean age was the lowest in the initial K-L grade 0 group (57.6 ± 11.9, post hoc p < 0.001), and BMI was the lowest in the initial K-L grade 2 group (23.0 ± 2.8, post hoc p < 0.001). The initial K-L grade 2 group had the highest prevalence of osteopenia or osteoporosis (p < 0.001). The proportion of individuals with high physical demand occupations was the highest in the initial K-L grade 2 group (p < 0.001). The rate of comorbidity was the highest in the initial K-L grade 2 group. The highest K-L grade (3.8 ± 0.4) observed at the final follow-up visit was in the initial K-L grade 2 group (post hoc p < 0.001).
Comparison of affecting factors according to the progression rate of OA is summarized in Table 2. The fast-progression OA group tended to have a higher proportion of elderly individuals and a higher initial K-L grade (p < 0.001). The time required to proceed to the next K-L grade decreased as the K-L grade increased (Figure 3). Patient-specific data of Figure 3 were as follows: sex, male; age, 51; BMI, 28; occupation, manufacturer (mild physical demand); and comorbidity, not applicable. The average times required to progress from one K-L grade to the next were as follows: K-L grade 0 to 1, 10.05 years; K-L grade 1 to 2, 7.29 years; K-L grade 2 to 3, 5.26 years; and K-L grade 3 to 4, 4.15 years. BMI was the lowest in the fast-progression OA group (23.0 ± 2.8, post hoc p < 0.001), and osteopenia or osteoporosis was the most prevalent in the fast-progression OA group (p < 0.001). Patients in the fast-progression OA group tended to have occupations with higher physical demand (p < 0.001) and a higher incidence of comorbidities, especially hypertension (p < 0.001).
A detailed comparison of the outcomes of OA is presented in Table 3. Surgical intervention was performed on the knees within five years in 12.5% (312) of cases, more than five years in 17.6% (439) of cases, and conservative treatment was performed (not undergoing surgical intervention) in 69.9% (1741) of the cases. The mean age of patients who underwent surgical intervention was the highest in the within-five-years group (63.5 ± 7.0, post-hoc p < 0.001). Furthermore, participants who underwent surgical intervention within five years had the highest proportion of initial K-L grade 2 (49.0%) (153). Surgical intervention within five years was more frequent in the high physical demand occupation category (p < 0.001) and in patients with hypertension or diabetes mellitus (p < 0.001).
The detailed performance of the model is summarized in Figure 2 and Table 4. The accuracy of the model for the progression rate of OA was 0.632, 0.616, and 0.644 for fast, usual, and slow progression, respectively, and the accuracies for the fate of OA were 0.874, 0.803, and 0.876 for the within-five-years, more-than-five-years, and conservative groups, respectively. A detailed comparison of the coefficients used to estimate the effect of each factor is presented in Table 5. Initial K-L grade and physical occupational demands were major contributors to both the progression rate and the fate of OA. BMD was the major contributor to the progression of OA. Hypertension was a major contributor to the outcome, but not the progression of OA. The progression rate and fate of OA differed according to the initial K-L grades, even in patients with similar conditions (Figure 4A,B). Patient-specific data of Figure 4A were as follows: (a) sex, female; age, 56; BMI, 25; occupation, no work (low physical demand); and comorbidity, not applicable; (b) sex, female; age, 59; BMI, 27; occupation, no work (low physical demand); and comorbidity, not applicable. Patient-specific data of Figure 4B were as follows: sex, female; age, 64; BMI, 29; occupation, serving worker (mild physical demand); and comorbidity, hypertension.
Among patients with the same initial K-L grades, the progression rate and fate of OA differed according to patient-specific situations, especially physical occupational demands (Figure 5A,B). Patient-specific data of Figure 5A were as follows: (a) sex, male; age, 48; BMI, 26; occupation, office worker (low physical demand); and comorbidity, not applicable; (b) sex, male; age, 46; BMI, 28; occupation, construction worker (high physical demand); and comorbidity, not applicable. Patient-specific data of Figure 5B were as follows (white arrow head: chondral lesion; white arrow: complex tears and subluxation of medial meniscus): sex, male; age, 47; BMI, 26; occupation, farmer (high physical demand); comorbidity, not applicable.

Discussion
The principal findings of this study were as follows: several factors affecting the rate of remarkable progression and fate of primary OA were identified, and these were weighted by their degree of contribution to OA. Regarding the progression rate of OA, BMD, initial K-L grade, and physical occupational demands were major contributors, but initial K-L grade and physical occupational demands were more important contributors than BMD. Regarding the fate of OA, hypertension, initial K-L grade, and physical occupational demands were major contributors, but initial K-L grade and physical occupational demands were more significant than hypertension. The progression rate and fate of OA were more affected by the initial K-L grade and physical occupational demands (both were major contributors) than other factors. Furthermore, it was possible to predict the progression rate and fate of OA by considering patient-specific situations.

In this study, the proportion of men was lower than in previous studies [10,11]; women accounted for 81.4% (2028 of 2492) of the cohort. The reasons for this disparity might be as follows. (1) Women complied better than men with the inclusion and exclusion conditions of this study. (2) As is well known, elderly women have a higher rate of OA than men on account of physiological factors [10], and in this study the proportion of patients aged 55 years or older was larger than that of patients younger than 55 years (67% overall and 74% in the progression rate of OA). Finally, (3) daily lifestyle and occupations in Korea involve many situations in which women kneel and squat [11].
It is generally suggested that OA progression is associated with obesity, which is positively correlated with BMI [17,18]. However, in this study, a lower BMI had a greater effect on the progression rate and fate of OA. The relationship between BMD and OA has been controversial in many studies. Several studies have reported that a higher BMD is associated with a greater risk of developing knee OA [19][20][21][22]. However, in our study, patients with osteoporosis and osteopenia tended to have a faster rate of OA progression and a higher rate of surgical intervention. Aging and biological interactions may be possible reasons for these contradictory results, which may be associated with longitudinal cartilage loss in the knees [20,21,23]. Furthermore, low BMI can cause low BMD. Low BMD can cause microfracture at the bone under the cartilage or a varus deformity as a result of bowing. In addition, the varus deformity is common in Asians, and these factors are thought to have worked in combination [24,25].
The initial K-L grade was the most important factor affecting the progression and fate of OA. It is generally reported that varus alignment and chondro-meniscal status are strong risk factors for OA progression [12,26]. Osteotomy for patients with meniscal tears may present an opportunity for early intervention in patients with varus alignment and symptomatic early knee OA to limit the progression of OA and protect meniscal lesions [27,28]. This suggests that the treatment of OA must be started early and adapted according to the diagnosed K-L grade and other associated conditions of the knee, such as alignment and chondro-meniscal status.
In lifestyle, kneeling and squatting are considered the most important risk factors for knee disorders in many studies [6,26,29]. We classified physical occupational demands by modifying the reports of Palmer et al. and Reid et al. [14,16]. In our study, higher physical occupational demands were related to the progression rate and fate of OA, which corresponded well with results from previous reports [14,30]. Epidemiological studies have reported a positive association between OA and several comorbidities, such as hypertension and diabetes mellitus. Comorbidities are commonly associated with obesity, dyslipidemia, and hyperglycemia [12,26,31]. In this study, hypertension and diabetes mellitus tended to have a higher prevalence in patients with faster OA progression and earlier surgical intervention.
In this study, hypertension and diabetes mellitus did not contribute to the progression rate and the fate of OA more than the initial K-L grade and physical occupational demands. This does not mean that hypertension and diabetes mellitus were unrelated to OA; both were also contributors to the progression rate and the fate of OA. The relationship of each metabolic disease with knee OA has already been examined in several reports. In many studies, hypertension was associated with knee OA. Furthermore, hypertension was related to higher OA knee pain severity [32][33][34]. Our results agree with those studies: hypertension was a major contributor to the fate of OA, which suggests that hypertension is related to severe OA. Louati et al. reported that the risk of knee OA is significantly higher in the diabetes mellitus population, but causality has not yet been clearly demonstrated [9]. However, some other meta-analyses do not support diabetes mellitus as an independent risk factor for knee OA. Rather, BMI was a more important confounding factor than diabetes mellitus [35,36]. Similarly, in our report, diabetes mellitus was a minor contributor to both the progression rate and the fate of OA.
Although many studies have attempted to develop a treatment plan for OA, decision making for the management of early OA is still ambiguous, even though the importance of the appropriate management of early OA has been emphasized. The reasons may be that many factors can affect the progression rate and fate of OA, and it might be difficult to identify the determining factors. CDWs provide serial data on various disease courses for OA, which makes it possible to analyze them [15,37]. The important difference between our method and those in previous reports is that our method leverages individualized patient-specific situations that are easy to obtain, such as occupation and comorbidities, and it tries to analyze them over time. In addition, we used patient-specific data acquired at a single institution, which could provide strong and consistent data to predict the progression rate and fate of OA. Our findings can be meaningful in supporting clinicians in planning treatment for patients with no radiologic or mild OA with knee pain using various methods, including pharmaceutical ones [38,39].

Limitations
This study had several limitations. First, the proportion of men was lower than in previous studies [10,11]. However, this was uncontrollable, and we could not find a specific reason that explains the disparity. Second, we treated a change of two or more K-L grades as the same situation, because we focused on the rate of remarkable progression; our purpose was to find variables of remarkable progression to support clinicians in treatment planning. Third, we performed a comprehensive analysis of patient-specific situations affecting the progression and fate of OA. Thus, this study is limited in explaining the biological relationship between each factor and OA. Furthermore, in cases in which both knees were included (814 patients), we could not fully explain why the degree of OA differed between the two knees; the characteristics of the occupation and differences in the side mainly used might account for this. Finally, the comorbidities included in "other disorders" were too heterogeneous, and their numbers were limited.

Conclusions
Among the several contributing factors identified, the initial K-L grade and physical occupational demands were the most important. BMD and hypertension were also major contributors to the progression and fate of OA, but their degree of contribution was lower than that of the two major factors.

Informed Consent Statement: Not applicable.
Data Availability Statement: Dr. Yong Seuk Lee (smcos1@hanmail.net) had full access to all data in the study and takes responsibility for the integrity of the data and accuracy of the data analysis.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. A Detailed Explanation of Algorithms and Models
Appendix A.1. Model Explanation
Logistic regression
$Y$ denotes the binary response variable that is the prediction target, and $X_1, \ldots, X_p$ denote the random variables considered as patient-specific variables, termed features in this study. The LR model links the conditional probability $P(Y = 1 \mid X_1, \ldots, X_p)$ to $X_1, \ldots, X_p$ through

$$P(Y = 1 \mid X_1, \ldots, X_p) = \frac{e^{\beta_0 + \beta_1 X_1 + \ldots + \beta_p X_p}}{1 + e^{\beta_0 + \beta_1 X_1 + \ldots + \beta_p X_p}},$$

where $\beta_0, \beta_1, \ldots, \beta_p$ are regression coefficients (estimated by maximum likelihood). The probability that $Y = 1$ for a new instance is then estimated by replacing the $\beta$'s by their estimated counterparts and the $X$'s by their realizations for the considered new instance. The new instance is then assigned to class $Y = 1$ if $P(Y = 1) > c$, where $c$ is a fixed threshold, and to class $Y = 0$ otherwise ($c = 0.5$).
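The logistic regression rule described in this subsection can be sketched in a few lines of Python. The function names and the coefficient values are hypothetical and serve only to illustrate the formula and the threshold c = 0.5; the two branches are algebraically the same expression, written separately to avoid numerical overflow.

```python
import math


def logistic_probability(intercept, coefficients, features):
    """P(Y = 1 | X) = e^z / (1 + e^z), with z = beta_0 + sum(beta_i * x_i)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))  # same value, stable for large positive z
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def classify(probability, threshold=0.5):
    """Assign class 1 if P(Y = 1) > c, and class 0 otherwise."""
    return 1 if probability > threshold else 0


# Hypothetical coefficients for two encoded features (illustration only).
probability = logistic_probability(-1.0, [0.8, 0.5], [1, 2])  # z = 0.8
predicted_class = classify(probability)
```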
SoftMax
The SoftMax function normalizes the class scores into a probability distribution, with the maximal result getting the largest portion of the distribution, while the other, smaller elements receive some of it as well. Because it outputs a probability distribution, the SoftMax function is appropriate as the classifier for probabilistic tasks.
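For reference, the standard definition of SoftMax maps scores z_1, ..., z_K to p_k = e^{z_k} / (e^{z_1} + ... + e^{z_K}). A minimal Python sketch of that standard definition follows; the scores are invented, and subtracting the maximum score before exponentiation is a common numerical safeguard that does not change the result.

```python
import math


def softmax(scores):
    """Convert a list of real-valued class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exponentials = [math.exp(s - shift) for s in scores]
    total = sum(exponentials)
    return [e / total for e in exponentials]


# Hypothetical scores for three classes (for example slow, usual, fast progression).
probabilities = softmax([2.0, 1.0, 0.1])
predicted_class = probabilities.index(max(probabilities))
```

The largest score receives the largest share of the probability mass, and every other class still receives a non-zero share.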

Cross-Entropy Loss (CE)
Cross-entropy measures the distance between what the model believes the output distribution should be and what the real distribution is. The cross-entropy measure is a widely used alternative to squared error. It is used when the output can be understood as representing the probability that each hypothesis might be true. Thus, it is used as a loss function in the output layer.
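In its standard form, the cross-entropy between a one-hot true label y and predicted probabilities p is CE = -sum_k y_k log(p_k), which reduces to the negative log of the probability assigned to the true class. The sketch below illustrates this with invented probabilities; the function names are hypothetical.

```python
import math


def cross_entropy(predicted_probabilities, true_class):
    """Cross-entropy for one sample with a one-hot true label: -log p[true_class]."""
    return -math.log(predicted_probabilities[true_class])


def mean_cross_entropy(batch_probabilities, true_classes):
    """Average cross-entropy over a batch of samples."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probabilities, true_classes)]
    return sum(losses) / len(losses)


# Hypothetical predictions for two knees over three classes.
batch = [[0.25, 0.5, 0.25], [0.7, 0.2, 0.1]]
true_labels = [1, 0]
loss = mean_cross_entropy(batch, true_labels)
```

The loss is zero only when the model assigns probability 1 to the true class, and it grows as that probability shrinks.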